Interphase chromosomes of the Aedes aegypti mosquito are liquid crystalline and can sense mechanical cues

We use data-driven physical simulations to study the three-dimensional architecture of the Aedes aegypti genome. Hi-C maps exhibit both a broad diagonal and compartmentalization with telomeres and centromeres clustering together. Physical modeling reveals that these observations correspond to an ensemble of 3D chromosomal structures that are folded over and partially condensed. Clustering of the centromeres and telomeres near the nuclear lamina appears to be a necessary condition for the formation of the observed structures. Further analysis of the mechanical properties of the genome reveals that the chromosomes of Aedes aegypti, by virtue of their atypical structural organization, are highly sensitive to the deformation of the nuclei. This last finding provides a possible physical mechanism linking mechanical cues to gene regulation.

Pearson's correlation and contact probability as a function of the genomic distance. [2][3][4][5] The curves were obtained for each chromosome extracted from the Aedes aegypti full nucleus simulation. The red curves on A, C, and E and on B, D, and F correspond to a correlation baseline (comparing chromosomes 1 and 2 from experimental Hi-C maps). The modeling correlation (comparing the simulated and experimental Hi-C maps) is significantly higher than the red baseline curve as a function of genomic distance. The labels 1M and 1P in the main text are used to distinguish between the two copies of the same chromosome when simulating the full nucleus. C1 is presented here as data obtained for the chromosome without distinguishing between homologous chromosomes.
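A correlation curve of this kind can be sketched as follows. This is a minimal illustration, not the analysis code used for the figure: it assumes both maps are square matrices at the same resolution, and the function names `pearson` and `correlation_by_distance` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if undefined)."""
    n = len(x)
    if n < 2:
        return float("nan")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

def correlation_by_distance(map_a, map_b):
    """Correlate the d-th diagonals of two square contact maps, d = 1..n-1."""
    n = len(map_a)
    curve = []
    for d in range(1, n):
        # All pairs (i, i + d) share the same genomic separation d (in bins).
        diag_a = [map_a[i][i + d] for i in range(n - d)]
        diag_b = [map_b[i][i + d] for i in range(n - d)]
        curve.append(pearson(diag_a, diag_b))
    return curve
```

Under these assumptions, calling `correlation_by_distance(simulated, experimental)` would give the modeling curve, and calling it on two different experimental chromosomes (cropped to a common size) would give a baseline.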

Supplementary Discussion
Comparison Between Human and Mosquito Genomes Using ACA Score
In our recent work, 6 we performed in situ Hi-C on 24 species, and the architectural features observed in those maps can be divided into two groups, type-I and type-II. The type-I group (which includes Aedes aegypti) incorporates Rabl-like configurations such as centromere clustering, telomere clustering, and a telomere-to-centromere axis. The type-II group (human and other mammals) includes the genome organization related to chromosome territories.
These architectural features are identified using a scoring parameter called ACA (Aggregate Chromosome Analysis). ACA aggregates the signal from all intra- and inter-chromosomal contacts, and the scores are defined as observed-to-expected ratios (see Supplementary Information of ref 6 for details). The ACA chromosome territory score S(ct) for Aedes aegypti is 1.109, which is low compared to 11.150 for the human genome or 6.179 for the Tammar wallaby chromosomes (also a mammal). These ACA values place the human and Tammar wallaby genomes in the type-II group, which forms territories. On the other hand, the low value of S(ct) places the mosquito in the type-I group (see Supplementary Tables S3 and S8 of ref 6).
In addition to this ACA score, we compare the contact probability as a function of the genomic distance for chromosome 1 from Aedes aegypti and the human cell line GM12878. 7
Supplementary Fig. 4: Contact probability as a function of genomic distance obtained from experiments. The solid red curve represents the data extracted from the Hi-C maps for chromosome 1 of Aedes aegypti. 8,9 The dashed blue line represents the data obtained from the Hi-C map of human chromosome 1 from the GM12878 cell line. 7

Comparison Between Different Matrix Balancing Methods
The Hi-C map used in the model training was balanced using the Knight-Ruiz (KR) 10 algorithm as implemented in the Juicer software package. 11 To address possible artifacts generated by the normalization, we calculated the contact frequency as a function of the genomic distance for Aedes aegypti chromosome 1 using three different normalization methods available in Juicer, as well as the raw (un-normalized) data. Supplementary Fig. 5 shows that the scaling curves do not vary significantly between normalization methods. We can therefore assume that there is no over- or underestimation of short- or long-range contacts due to matrix balancing.
Supplementary Fig. 5: Contact probability as a function of the genomic distance of Aedes aegypti chromosome 1 extracted using different normalizations.
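The scaling curve behind such a comparison can be sketched as below. This is a minimal sketch under stated assumptions: the contact map is a dense square matrix (list of lists), the curve is the mean of each off-diagonal, and the helper names `contact_probability` and `max_relative_difference` are hypothetical rather than part of Juicer.

```python
def contact_probability(matrix):
    """Mean contact frequency at each genomic separation s = 1..n-1 (in bins)."""
    n = len(matrix)
    curve = []
    for s in range(1, n):
        diagonal = [matrix[i][i + s] for i in range(n - s)]
        curve.append(sum(diagonal) / len(diagonal))
    return curve

def max_relative_difference(curve_a, curve_b):
    """Largest relative deviation between two curves after rescaling each
    by its value at s = 1, so that only the shape is compared."""
    a0, b0 = curve_a[0], curve_b[0]
    worst = 0.0
    for a, b in zip(curve_a, curve_b):
        a_n, b_n = a / a0, b / b0
        if a_n > 0.0:
            worst = max(worst, abs(a_n - b_n) / a_n)
    return worst
```

Rescaling by the first point is a design choice for this sketch: different balancing schemes change the overall scale of the map, so the curves are compared by shape only.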

Correlation Between A/B Compartments and High/Low ATAC-seq Intensity Values
We use the first eigenvector (EV1) extracted from the correlation matrix of the experimental Hi-C map to determine the A/B compartments. The ATAC-seq signals are divided into two groups, High and Low, based on the signal intensity. The High group contains signal values above the 95th percentile (colored in green in Supplementary Fig. 6A). The Low group corresponds to signal values below the 5th percentile (colored in gray in Supplementary Fig. 6A). There are a total of 156 elements in each percentile cut. From the High group, 100 loci are identified as belonging to the A compartment (red area in Supplementary Fig. 6B), which corresponds to 64% agreement between a locus being classified as A type and having a high ATAC-seq signal value. In addition, there are 91 loci identified as B compartments (blue area in Supplementary Fig. 6B), which gives 59% hits.
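The counting step can be illustrated with a short sketch. It assumes that EV1 and the ATAC-seq signal are already available as per-locus lists, that a positive EV1 marks the A compartment (the sign of an eigenvector is arbitrary and has to be fixed beforehand), and that a nearest-rank percentile is adequate. The names `percentile_cut` and `compartment_agreement` are hypothetical.

```python
def percentile_cut(values, percent):
    """Nearest-rank percentile; percent is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = -(-percent * len(ordered) // 100)  # ceiling division
    rank = min(max(rank, 1), len(ordered))
    return ordered[rank - 1]

def compartment_agreement(ev1, atac, high_percent=95, low_percent=5):
    """Count High-ATAC loci with EV1 > 0 (assumed A) and Low-ATAC loci
    with EV1 < 0 (assumed B)."""
    high_cut = percentile_cut(atac, high_percent)
    low_cut = percentile_cut(atac, low_percent)
    high = [i for i, value in enumerate(atac) if value > high_cut]
    low = [i for i, value in enumerate(atac) if value < low_cut]
    a_hits = sum(1 for i in high if ev1[i] > 0)
    b_hits = sum(1 for i in low if ev1[i] < 0)
    return a_hits, len(high), b_hits, len(low)
```

The agreement is then the ratio of hits to group size; for example, 100 hits out of 156 loci gives 100/156, or about 64%.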
Supplementary Fig. 6: A - ATAC-seq normalized (min-max normalization) signal distribution. High (green) and Low (gray) correspond to the 95th and 5th percentiles, respectively. B - Distribution of the first eigenvector (EV1) of the Pearson correlation matrix extracted from the Hi-C map of chromosome 1 of Aedes aegypti. Red and blue correspond to A and B chromatin types. C - Color diagram of the EV1 and ATAC-seq signal for each locus of chromosome 1 at 100 kb resolution.
Supplementary Table 1: Two-sided Kolmogorov-Smirnov test statistic (KS-value) comparing the High and Low ATAC-seq value distributions. The test evaluates the null hypothesis that two independent data sets are drawn from the same continuous distribution. 12 The distributions are presented in Figure 3D of the main text.
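For reference, the two-sample KS statistic is the largest vertical distance between the two empirical cumulative distribution functions. The sketch below computes only that statistic, without a p-value, and is not the routine used to produce the table; the function name `ks_statistic` is hypothetical. Library routines such as `scipy.stats.ks_2samp` return the same statistic together with a p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    i = j = 0
    distance = 0.0
    while i < n_a and j < n_b:
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, so ties are handled.
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        distance = max(distance, abs(i / n_a - j / n_b))
    return distance
```

A value close to 1 means the two samples barely overlap, while a value close to 0 means their distributions are nearly indistinguishable.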

Homopolymer Model
The homopolymer potential employed here describes a generic bead-spring polymer in which each bead represents a genomic segment of 100 kb. The potential energy $U_{HP}(\vec{r})$ describes a spatially self-avoiding polymer and serves as a support for the features added by using the maximum entropy principle. [1][2][3]5 This potential consists of the following four terms: $U_{FENE}$, $U_{Angle}$, $U_{hc}$, and $U_{sc}$.
$U_{FENE}$ (Finite Extensible Nonlinear Elastic potential) is the bonding term applied between two consecutive monomers, connecting a sequence of beads with nonlinear springs. 13 $K_b$ is the spring constant and $R_0$ is the equilibrium bond length.
Additionally, the hard-core repulsive potential $U_{hc}(\vec{r}_{i,j})$ is included to avoid overlap between bonded monomers.
A three-body interaction is applied to every set of three consecutive connected monomers to account for a nonzero bending stiffness. 14 This angular potential, $U_{Angle}$, regulates the chain flexibility and depends on $\theta_i$, the angle defined by the two vectors $\vec{r}_{i,i+1}$ and $\vec{r}_{i,i-1}$.
All non-bonded pair interactions are described by the soft-core repulsive potential $U_{sc}$. It is based on the Lennard-Jones potential $U_{LJ}(\vec{r}_{i,j})$, capped off at a finite distance, thus allowing for chain crossing at a finite energetic cost (mimicking topoisomerase activity). $r_0$ is chosen as the distance at which $U_{LJ}(\vec{r}_{i,j}) = \frac{1}{2} E_{cut}$. The model parameters are given in reduced units.
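For orientation, the four terms are commonly written in the following form in bead-spring chromatin models of this family. This is a sketch under assumptions, not a transcription of the equations of this work: the symbols $\epsilon$ and $\sigma$ (Lennard-Jones energy and length scales), $k_a$ (angular stiffness), and $\theta_0$ (reference angle) are introduced here for illustration, and the exact prefactors and cutoffs used by the authors may differ.

```latex
% Assumed standard forms; epsilon, sigma, k_a and theta_0 are illustrative symbols.
\begin{align}
U_{FENE}(r_{i,j}) &=
  \begin{cases}
    -\frac{1}{2} K_b R_0^{2} \ln\!\left[1 - \left(\frac{r_{i,j}}{R_0}\right)^{2}\right], & r_{i,j} \le R_0 \\
    0, & \text{otherwise}
  \end{cases} \\
U_{LJ}(r_{i,j}) &= 4\epsilon\left[\left(\frac{\sigma}{r_{i,j}}\right)^{12} - \left(\frac{\sigma}{r_{i,j}}\right)^{6}\right] + \epsilon \\
U_{hc}(r_{i,j}) &=
  \begin{cases}
    U_{LJ}(r_{i,j}), & r_{i,j} \le 2^{1/6}\sigma \\
    0, & \text{otherwise}
  \end{cases} \\
U_{Angle}(\theta_i) &= k_a\left[1 - \cos(\theta_i - \theta_0)\right] \\
U_{sc}(r_{i,j}) &=
  \begin{cases}
    \frac{1}{2} E_{cut}\left[1 + \tanh\!\left(\frac{2\,U_{LJ}(r_{i,j})}{E_{cut}} - 1\right)\right], & r_{i,j} < r_0 \\
    U_{LJ}(r_{i,j}), & r_0 \le r_{i,j} \le 2^{1/6}\sigma \\
    0, & \text{otherwise}
  \end{cases}
\end{align}
```

In this form, $U_{sc}$ is continuous at $r_0$: there $U_{LJ} = \frac{1}{2} E_{cut}$, the argument of the hyperbolic tangent vanishes, and both branches equal $\frac{1}{2} E_{cut}$. As $r_{i,j} \to 0$ the soft-core term saturates at $E_{cut}$, which is what keeps the cost of chain crossing finite.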

Centromere and Telomeres Position Restraints
Centromeric and telomeric regions are anchored to the nucleus wall using a flat-bottomed potential $U_{rfb}$. This potential restrains the loci within a chosen region of the simulation volume. If a locus moves outside this region, a harmonic force pulls the bead back toward the flat-bottomed region, whereas no force acts on a centromeric or telomeric locus inside the flat-bottomed part of the potential. In this potential, $R_{f0} = 2.5\sigma$ is the reference distance of the flat bottom, $r_i$ is the locus's distance from the reference, $k_{rfb} = 0.2$ (in reduced units) is the force constant, and $\Theta$ is the Heaviside step function. The position restraints applied to centromeres and telomeres to maintain the polarized architecture can be associated with the force/tension of their interaction with the nucleus membrane. Increasing or decreasing the centromere-telomere spatial distance would mimic the effect of nuclear deformation by expansion and compression. Supplementary Table 2 lists the beads that belong to each region for each chromosome.
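A flat-bottomed restraint with these ingredients is usually written as $U_{rfb}(r_i) = \frac{1}{2} k_{rfb} (r_i - R_{f0})^{2}\, \Theta(r_i - R_{f0})$. The sketch below evaluates that assumed form; the factor of one half and the function names are assumptions made for illustration, and the default values are the ones quoted in the text, in reduced units.

```python
def flat_bottom_energy(r, r_f0=2.5, k_rfb=0.2):
    """Energy of a flat-bottomed harmonic restraint (assumed form).

    Zero while the locus is within r_f0 of its reference position,
    harmonic in the excess distance beyond r_f0.
    """
    excess = r - r_f0
    if excess <= 0.0:  # Heaviside step: no penalty inside the flat region
        return 0.0
    return 0.5 * k_rfb * excess ** 2

def flat_bottom_force(r, r_f0=2.5, k_rfb=0.2):
    """Radial force -dU/dr; negative values pull the bead back inward."""
    excess = r - r_f0
    if excess <= 0.0:
        return 0.0
    return -k_rfb * excess
```

With these defaults, a locus at distance 3.5 from its reference lies one unit outside the flat region and feels a restoring force of magnitude 0.2, while a locus at distance 2.0 feels none.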
Supplementary Table 2: Set of beads that belong to a certain region (centromeric or telomeric) for each chromosome. C labels the centromere; T1 and T2 label the two telomeres.

Crosslinking Probability Function
One challenge in modeling chromosomes based on Hi-C maps is the relationship between the crosslinking probability in the experiment and the geometric distance between a pair of loci. MiChroM modeling assumes that the probability of a pair of loci i and j forming a contact in 3D space decreases as a function of their distance $r_{i,j}$. Accordingly, $f(r_{i,j})$ is taken to be a sigmoid (switch) function that assigns a high crosslinking probability to short distances and a low probability to long distances. Its parameters µ and $r_c$ are determined from the experimental Hi-C maps according to two criteria: i) $f(r_{i,j})$ is calibrated to return 1 when two beads are in contact (the distance between the centers of the two beads equals 1, in reduced units σ), i.e., f(1) = 1; ii) $f(r_{i,j})$ is tuned to return the experimental probability for next-nearest neighbors, whose maximum distance is 2 in reduced units σ. $f(r_{i,j})$ should decrease monotonically with distance, and the minimum value of the experimental probabilities must match the value at the maximum next-nearest-neighbor distance, i.e., $f(2) = \min\{P^{exp}_{i,i+2}\}$. The parameters adjusted for the Hi-C maps of Aedes aegypti are µ = 3.48 and $r_c$ = 1.76.
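In MiChroM-type models the switch function is typically $f(r_{i,j}) = \frac{1}{2}\left(1 + \tanh\left[\mu\,(r_c - r_{i,j})\right]\right)$. The sketch below assumes that form and evaluates it with the parameters quoted for Aedes aegypti; the function name `crosslink_probability` is hypothetical.

```python
import math

def crosslink_probability(r, mu=3.48, r_c=1.76):
    """Sigmoid switch function (assumed tanh form): close to 1 at short
    distances, exactly 0.5 at r = r_c, and decaying toward 0 beyond it."""
    return 0.5 * (1.0 + math.tanh(mu * (r_c - r)))

# Evaluate at the two calibration distances discussed in the text.
at_contact = crosslink_probability(1.0)        # beads in contact
at_next_nearest = crosslink_probability(2.0)   # maximum next-nearest-neighbor distance
```

Under this assumed form, the value at r = 1 is about 0.995, consistent with the first calibration criterion, and the value at r = 2 is about 0.16.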

Orientation Order Parameter - $O_{OP}$
The orientation order parameter $O_{OP}$ is defined as the correlation between the two unit vectors connecting beads [i, i + 4] and [j, j + 4], where $\vec{r}_{i,i+4}$ is the unit vector that points from the i-th to the (i + 4)-th locus. The brackets